145 research outputs found

    Terrain guided multi-level instancing of highly complex plant populations

    Get PDF

    XEngine : Optimal Tensor Rematerialization for Neural Networks in Heterogeneous Environments

    Get PDF
    Memory efficiency is crucial in training deep learning networks on resource-restricted devices. During backpropagation, forward tensors are used to calculate gradients. Despite the option of keeping those dependencies in memory until they are reused in backpropagation, some forward tensors can be discarded and recomputed later from saved tensors, so-called checkpoints. This allows, in particular, for resource-constrained heterogeneous environments to make use of all available compute devices. Unfortunately, the definition of these checkpoints is a non-trivial problem and poses a challenge to the programmer—improper or excessive recomputations negate the benefit of checkpointing. In this article, we present XEngine, an approach that schedules network operators to heterogeneous devices in low memory environments by determining checkpoints and recomputations of tensors. Our approach selects suitable resources per timestep and operator and optimizes the end-to-end time for neural networks taking the memory limitation of each device into account. For this, we formulate a mixed-integer quadratic program (MIQP) to schedule operators of deep learning networks on heterogeneous systems. We compare our MIQP solver XEngine against Checkmate [12], a mixed-integer linear programming (MILP) approach that solves recomputation on a single device. Our solver finds solutions that are up to 22.5% faster than the fastest Checkmate schedule in which the network is computed exclusively on a single device. We also find valid schedules for networks making use of both central processing units and graphics processing units if memory limitations do not allow scheduling exclusively to the graphics processing unit

    Isotropic clustering for hierarchical radiosity - implementation and experiences

    Get PDF
    Although Hierarchical Radiosity was a big step forward for finite element computations in the context of global illumination, the algorithm can hardly cope with scenes of more than medium complexity. The reason is that Hierarchical Radiosity requires an initial linking step, comparing all pairs of initial objects in the scene. These initial objects are then hierarchically subdivided in order to accurately represent the light transport between them. Isotropic Clustering, as introduced by Sillion, in addition creates a hierarchy above the input objects. Thus, it allows for the interaction of complete clusters of objects and avoids the costly initial linking step. In this paper, we describe our implementation of the isotropic clustering algorithm and discuss some of the problems that we encountered. The complexity of the algorithm is examined and clustering strategies are compared

    A Quality-Centered Analysis of Eye Tracking Data in Foveated Rendering

    Get PDF
    This work presents the analysis of data recorded by an eye tracking device in the course of evaluating a foveated rendering approach for head-mounted displays (HMDs). Foveated ren- dering methods adapt the image synthesis process to the user’s gaze and exploiting the human visual system’s limitations to increase rendering performance. Especially, foveated rendering has great potential when certain requirements have to be fulfilled, like low-latency rendering to cope with high display refresh rates. This is crucial for virtual reality (VR), as a high level of immersion, which can only be achieved with high rendering performance and also helps to reduce nausea, is an important factor in this field. We put things in context by first providing basic information about our rendering system, followed by a description of the user study and the collected data. This data stems from fixation tasks that subjects had to perform while being shown fly-through sequences of virtual scenes on an HMD. These fixation tasks consisted of a combination of various scenes and fixation modes. Besides static fixation targets, moving tar- gets on randomized paths as well as a free focus mode were tested. Using this data, we estimate the precision of the utilized eye tracker and analyze the participants’ accuracy in focusing the displayed fixation targets. Here, we also take a look at eccentricity-dependent quality ratings. Comparing this information with the users’ quality ratings given for the displayed sequences then reveals an interesting connection between fixation modes, fixation accuracy and quality ratings

    Parallel Multi-Hypothesis Algorithm for Criticality Estimation in Traffic and Collision Avoidance

    Full text link
    Due to the current developments towards autonomous driving and vehicle active safety, there is an increasing necessity for algorithms that are able to perform complex criticality predictions in real-time. Being able to process multi-object traffic scenarios aids the implementation of a variety of automotive applications such as driver assistance systems for collision prevention and mitigation as well as fall-back systems for autonomous vehicles. We present a fully model-based algorithm with a parallelizable architecture. The proposed algorithm can evaluate the criticality of complex, multi-modal (vehicles and pedestrians) traffic scenarios by simulating millions of trajectory combinations and detecting collisions between objects. The algorithm is able to estimate upcoming criticality at very early stages, demonstrating its potential for vehicle safety-systems and autonomous driving applications. An implementation on an embedded system in a test vehicle proves in a prototypical manner the compatibility of the algorithm with the hardware possibilities of modern cars. For a complex traffic scenario with 11 dynamic objects, more than 86 million pose combinations are evaluated in 21 ms on the GPU of a Drive PX~2

    Efficient Caustic Rendering with Lightweight Photon Mapping

    Get PDF
    Robust and efficient rendering of complex lighting effects, such as caustics, remains a challenging task. While algorithms like vertex connection and merging can render such effects robustly, their significant overhead over a simple path tracer is not always justified and – as we show in this paper ‐ also not necessary. In current rendering solutions, caustics often require the user to enable a specialized algorithm, usually a photon mapper, and hand‐tune its parameters. But even with carefully chosen parameters, photon mapping may still trace many photons that the path tracer could sample well enough, or, even worse, that are not visible at all. Our goal is robust, yet lightweight, caustics rendering. To that end, we propose a technique to identify and focus computation on the photon paths that offer significant variance reduction over samples from a path tracer. We apply this technique in a rendering solution combining path tracing and photon mapping. The photon emission is automatically guided towards regions where the photons are useful, i.e., provide substantial variance reduction for the currently rendered image. Our method achieves better photon densities with fewer light paths (and thus photons) than emission guiding approaches based on visual importance. In addition, we automatically determine an appropriate number of photons for a given scene, and the algorithm gracefully degenerates to pure path tracing for scenes that do not benefit from photon mapping
    • 

    corecore